Hadoop Perfect File: A fast and memory-efficient metadata access archive file to face small files problem in HDFS


Abstract

HDFS faces several issues when it comes to handling a large number of small files. These issues are well addressed by archive systems, which combine small files into larger ones and use index files to hold the information needed to retrieve a small file's content from the big archive file. However, existing archive-based solutions incur significant overheads, since additional processing and I/Os are needed to acquire the retrieval information before the actual content can be accessed, thereby deteriorating access efficiency. This paper presents a new archive file named Hadoop Perfect File (HPF). HPF minimizes these overheads by reading directly from the part of the index containing a file's metadata, and consequently reduces the extra I/Os and improves access efficiency. Our index system uses two hash functions: metadata records are distributed across index files using a dynamic hash function, and we further build an order-preserving perfect hash function that memorizes the position of a file's metadata record within its index file.
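The two-level index the abstract describes can be pictured with a short sketch. The Java below is a minimal illustration under assumed names (HpfIndexSketch, FileMeta, and the bucketing scheme are invented for the example), not the authors' implementation: a plain modulus stands in for the dynamic hash function that spreads records across index files, and a sorted list plus a position map stands in for the order-preserving perfect hash function that memorizes each record's position.

import java.util.*;

public class HpfIndexSketch {

    /** Metadata needed to fetch a small file's bytes from the big archive file. */
    record FileMeta(String name, long archiveOffset, int length) {}

    /** One index file: records kept in name order, each name's slot memorized. */
    static final class IndexFile {
        private final List<FileMeta> records = new ArrayList<>();      // sorted by name
        // Stand-in for the order-preserving perfect hash function:
        // maps a file name to the rank of its record inside this index file.
        private final Map<String, Integer> position = new HashMap<>();

        void add(FileMeta meta) {
            int slot = Collections.binarySearch(records, meta,
                    Comparator.comparing(FileMeta::name));
            if (slot < 0) slot = -slot - 1;
            records.add(slot, meta);
            // Re-memorize positions of shifted records; a real perfect hash
            // function would be rebuilt in batch rather than maintained like this.
            for (int i = slot; i < records.size(); i++) {
                position.put(records.get(i).name(), i);
            }
        }

        FileMeta lookup(String name) {
            Integer slot = position.get(name);   // lands on the exact slot, no scan
            return slot == null ? null : records.get(slot);
        }
    }

    private final IndexFile[] indexFiles;

    HpfIndexSketch(int numIndexFiles) {
        indexFiles = new IndexFile[numIndexFiles];
        for (int i = 0; i < numIndexFiles; i++) indexFiles[i] = new IndexFile();
    }

    /** Level 1: pick the index file for a name. A plain modulus stands in for
     *  the dynamic hash function, which lets index files split as they grow. */
    private int bucketOf(String name) {
        return Math.floorMod(name.hashCode(), indexFiles.length);
    }

    public void addFile(String name, long offset, int length) {
        indexFiles[bucketOf(name)].add(new FileMeta(name, offset, length));
    }

    public FileMeta find(String name) {
        return indexFiles[bucketOf(name)].lookup(name);
    }

    public static void main(String[] args) {
        HpfIndexSketch index = new HpfIndexSketch(4);
        index.addFile("a.txt", 0L, 120);
        index.addFile("b.txt", 120L, 300);
        System.out.println(index.find("b.txt"));
        // Prints: FileMeta[name=b.txt, archiveOffset=120, length=300]
    }
}

The point of the second level is that a lookup lands on the record's exact slot without scanning or binary-searching the index file; the paper achieves this with a perfect hash function rather than the hash map used here, which keeps the memorized positions far more compact in memory.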


Similar resources

Clustering Files with Extended File Attributes in Metadata

Classification and search play an important role in modern file systems, and file clustering is an effective approach to supporting them. This paper presents a new labeling system that makes use of the Extended File Attributes [1] of the file system, and a simple file clustering algorithm based on this labeling system is also introduced. By regarding attributes and attribute-value pairs as labels of files...
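As a generic illustration of the mechanism such a labeling system builds on (not the paper's own implementation), standard Java exposes extended file attributes through UserDefinedFileAttributeView. Support depends on the underlying file system (e.g. ext4 with user_xattr enabled, or NTFS), and the attribute key below is made up for the example.

import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.nio.file.*;
import java.nio.file.attribute.UserDefinedFileAttributeView;

/** Tag a file with a label stored in its extended attributes, then read it back. */
public class XattrLabelDemo {
    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("labeled", ".txt");
        UserDefinedFileAttributeView xattrs =
                Files.getFileAttributeView(file, UserDefinedFileAttributeView.class);
        if (xattrs == null) {
            System.err.println("This file system does not expose user-defined attributes.");
            return;
        }
        // Store an attribute-value pair as a label (hypothetical key "topic").
        xattrs.write("topic", StandardCharsets.UTF_8.encode("storage"));

        // Read the label back; files sharing a label would fall into one cluster.
        ByteBuffer buf = ByteBuffer.allocate(xattrs.size("topic"));
        xattrs.read("topic", buf);
        buf.flip();
        System.out.println("topic = " + StandardCharsets.UTF_8.decode(buf));
    }
}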


Comparing Hadoop and Fat-Btree Based Access Method for Small File I/O Applications

Hadoop has been widely used in various clusters to build scalable and high-performance distributed file systems. However, the Hadoop distributed file system (HDFS) is designed for large-file management. In the case of small-file applications, metadata requests flood the network and consume most of the memory in the Namenode, sharply hindering its performance. Therefore, many web applications ...


Hmfs: Efficient Support of Small Files Processing over HDFS

The storage and access of massive numbers of small files is one of the challenges in the design of distributed file systems. Hadoop distributed file system (HDFS) is primarily designed for reliable storage and fast access of very big files, while it suffers a performance penalty as the number of small files increases. A middleware called Hmfs is proposed in this paper to improve the efficiency of storing an...


Ex Vivo Comparison of File Fracture and File Deformation in Canals with Moderate Curvature: Neolix Rotary System versus Manual K-files

Background and Aim: Cleaning and shaping is one of the important steps in endodontic treatment and plays an important role in root canal treatment outcome. This study evaluated the rate of file fracture and file deformation of the Neolix rotary system and K-files in shaping the mesiobuccal canal of maxillary first molars with moderate curvature. Materials and Methods: In this ex vivo exp...


Fast File Access for Fast Agents

Mobile agents are a powerful tool for coordinating general-purpose distributed computing, where the main goal is high performance. In this paper we demonstrate how the inherent mobility of agents may be exploited to achieve fast file access, which is necessary for most general-purpose applications. We present a file system for mobile agents based exclusively on the local disks of the participating ...



Journal

Journal title: Journal of Parallel and Distributed Computing

Year: 2021

ISSN: 1096-0848, 0743-7315

DOI: https://doi.org/10.1016/j.jpdc.2021.05.011